Factor Complexity of Accident Occurrence: an Empirical Demonstration Using Boosted Regression Trees

Author

  • Yi-Shih Chung
Abstract

Factor complexity is regarded as a typical characteristic of traffic accidents. This paper proposes a novel method, boosted regression trees (BRTs), which is particularly well suited to investigating complicated, nonlinear relationships in high-variance traffic accident data. Single-motorcycle accident data from Taiwan for 2004–2005 are used to demonstrate the usefulness of BRTs. Traditional logistic regression and classification and regression tree (CART) models are also developed so that their estimation results and predictive performance can be compared with those of BRTs. Both the in-sample cross-validation and the out-of-sample validation results show that increasing tree complexity improves predictive performance, but with diminishing returns, indicating limited factor complexity in single-motorcycle accidents. While a certain portion of fatal accidents can be explained by the main effects of crucial variables, including geographical, time, and socio-demographic factors, the relatively unique fatal accidents are better approximated by interaction terms, especially combinations of behavioral factors. The BRT models generally provide better transferability than the logistic and CART models. Implications of the analysis results for devising safety policies are also provided.
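The link between tree complexity and interaction terms described in the abstract can be illustrated with a small sketch. The code below is not the paper's model or data: it is a minimal, standard-library-only gradient boosting routine with squared-error loss, run on synthetic binary features, and every name in it (`fit_tree`, `boost`, `make_data`, the coefficients, the learning rate) is an assumption chosen for illustration. Trees of depth 1 can only represent main effects, while trees of depth 2 can also capture two-way interactions.

```python
import random

def fit_tree(X, r, depth):
    """Greedy regression tree on residuals r; returns a float leaf or a split tuple."""
    mean = sum(r) / len(r)
    if depth == 0 or len(r) < 2:
        return mean
    best = None
    for j in range(len(X[0])):
        # Candidate thresholds: every observed value except the largest.
        for t in sorted(set(row[j] for row in X))[:-1]:
            left = [i for i, row in enumerate(X) if row[j] <= t]
            right = [i for i, row in enumerate(X) if row[j] > t]
            sse = 0.0
            for idx in (left, right):
                m = sum(r[i] for i in idx) / len(idx)
                sse += sum((r[i] - m) ** 2 for i in idx)
            if best is None or sse < best[0]:
                best = (sse, j, t, left, right)
    if best is None:  # no feature varies in this node
        return mean
    _, j, t, left, right = best
    return (j, t,
            fit_tree([X[i] for i in left], [r[i] for i in left], depth - 1),
            fit_tree([X[i] for i in right], [r[i] for i in right], depth - 1))

def predict_tree(tree, row):
    """Walk from the root to a leaf and return the leaf value."""
    while isinstance(tree, tuple):
        j, t, left, right = tree
        tree = left if row[j] <= t else right
    return tree

def boost(X, y, depth, n_trees=100, rate=0.1):
    """Gradient boosting for squared error: each tree fits the current residuals."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    trees = []
    for _ in range(n_trees):
        resid = [yi - pi for yi, pi in zip(y, pred)]
        tree = fit_tree(X, resid, depth)
        trees.append(tree)
        pred = [pi + rate * predict_tree(tree, row) for pi, row in zip(pred, X)]
    return base, trees, rate

def mse(model, X, y):
    """Mean squared error of a boosted model on (X, y)."""
    base, trees, rate = model
    total = 0.0
    for row, yi in zip(X, y):
        p = base + rate * sum(predict_tree(t, row) for t in trees)
        total += (p - yi) ** 2
    return total / len(y)

def make_data(n, rng):
    """Synthetic data (assumed): main effect of x0, x0*x1 interaction, x2 is noise."""
    X = [[rng.randint(0, 1) for _ in range(3)] for _ in range(n)]
    y = [row[0] + 2.0 * row[0] * row[1] + rng.gauss(0, 0.1) for row in X]
    return X, y

rng = random.Random(0)
X_train, y_train = make_data(300, rng)
X_test, y_test = make_data(300, rng)

# Out-of-sample error for main-effects-only (depth 1) vs. two-way interactions (depth 2).
mse_by_depth = {d: mse(boost(X_train, y_train, d), X_test, y_test) for d in (1, 2)}
print(mse_by_depth)
```

Because the synthetic outcome contains only a two-way interaction, depth 2 should recover most of the structure that depth 1 misses, and deeper trees would add little. That pattern of gains that flatten as complexity grows is the kind of evidence the abstract reads as limited factor complexity. A study with a binary outcome such as fatality would more plausibly use a logistic (binomial) loss rather than the squared error assumed here.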


Similar resources

An Empirical Comparison of Supervised Learning Algorithms Using Different Performance Metrics

We present results from a large-scale empirical comparison between ten learning methods: SVMs, neural nets, logistic regression, naive bayes, memory-based learning, random forests, decision trees, bagged trees, boosted trees, and boosted stumps. We evaluate the methods on binary classification problems using nine performance criteria: accuracy, squared error, cross-entropy, ROC Area, F-score, p...


An Empirical Evaluation of Supervised Learning for ROC Area

We present an empirical comparison of the AUC performance of seven supervised learning methods: SVMs, neural nets, decision trees, k-nearest neighbor, bagged trees, boosted trees, and boosted stumps. Overall, boosted trees have the best average AUC performance, followed by bagged trees, neural nets and SVMs. We then present an ensemble selection method that yields even better AUC. Ensembles are...


Evaluating Bayesian spatial methods for modelling species distributions with clumped and restricted occurrence data

Statistical approaches for inferring the spatial distribution of taxa (Species Distribution Models, SDMs) commonly rely on available occurrence data, which is often clumped and geographically restricted. Although available SDM methods address some of these factors, they could be more directly and accurately modelled using a spatially-explicit approach. Software to fit models with spatial autoco...


Using Ensemble-Based Methods for Directly Estimating Causal Effects: An Investigation of Tree-Based G-Computation

Researchers are increasingly using observational or nonrandomized data to estimate causal treatment effects. Essential to the production of high-quality evidence is the ability to reduce or minimize the confounding that frequently occurs in observational studies. When using the potential outcome framework to define causal treatment effects, one requires the potential outcome under each possible...


Boosted trees for ecological modeling and prediction.

Accurate prediction and explanation are fundamental objectives of statistical analysis, yet they seldom coincide. Boosted trees are a statistical learning method that attains both of these objectives for regression and classification analyses. They can deal with many types of response variables (numeric, categorical, and censored), loss functions (Gaussian, binomial, Poisson, and robust), and p...



Publication date: 2011